
    Graphle: Interactive exploration of large, dense graphs

    BACKGROUND: A wide variety of biological data can be modeled as network structures, including experimental results (e.g. protein-protein interactions), computational predictions (e.g. functional interaction networks), or curated structures (e.g. the Gene Ontology). While several tools exist for visualizing large graphs at a global level or small graphs in detail, previous systems have generally not allowed interactive analysis of dense networks containing thousands of vertices at a level of detail useful for biologists. Investigators often wish to explore specific portions of such networks from a detailed, gene-specific perspective, and balancing this requirement with the networks' large size, complex structure, and rich metadata is a substantial computational challenge. RESULTS: Graphle is an online interface to large collections of arbitrary undirected, weighted graphs, each possibly containing tens of thousands of vertices (e.g. genes) and hundreds of millions of edges (e.g. interactions). These are stored on a centralized server and accessed efficiently through an interactive Java applet. The Graphle applet allows a user to examine specific portions of a graph, retrieving the relevant neighborhood around a set of query vertices (genes). This neighborhood can then be refined and modified interactively, and the results can be saved either as publication-quality images or as raw data for further analysis. The Graphle web site currently includes several hundred biological networks representing predicted functional relationships from three heterogeneous data integration systems: S. cerevisiae data from bioPIXIE, E. coli data using MEFIT, and H. sapiens data from HEFalMp. CONCLUSIONS: Graphle serves as a search and visualization engine for biological networks, which can be managed locally (simplifying collaborative data sharing) and investigated remotely. The Graphle framework is freely downloadable and easily installed on new servers, allowing any lab to quickly set up a Graphle site from which their own biological network data can be shared online.
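The neighborhood retrieval described in the Graphle abstract can be illustrated with a minimal sketch. The abstract does not specify how Graphle selects a neighborhood, so the scoring rule here (rank non-query vertices by their total edge weight to the query set) is an assumption, and the function names, gene names, and weights are hypothetical.

```python
from collections import defaultdict


def build_graph(edges):
    """Build an undirected, weighted adjacency map from (u, v, weight) triples."""
    graph = defaultdict(dict)
    for u, v, w in edges:
        graph[u][v] = w
        graph[v][u] = w
    return graph


def query_neighborhood(graph, query, k=2):
    """Return the k non-query vertices with the highest total edge weight
    to the query set (an assumed scoring rule, not Graphle's documented one)."""
    query = set(query)
    score = defaultdict(float)
    for q in query:
        for neighbor, w in graph.get(q, {}).items():
            if neighbor not in query:
                score[neighbor] += w
    # Sort by descending score, then by name so ties are deterministic.
    ranked = sorted(score.items(), key=lambda item: (-item[1], item[0]))
    return [vertex for vertex, _ in ranked[:k]]


# Hypothetical toy network: vertices are genes, weights are interaction confidences.
edges = [
    ("geneA", "geneB", 0.9),
    ("geneA", "geneC", 0.2),
    ("geneB", "geneC", 0.7),
    ("geneB", "geneD", 0.4),
    ("geneC", "geneD", 0.8),
]
graph = build_graph(edges)
print(query_neighborhood(graph, ["geneA", "geneB"], k=1))
```

For a graph with hundreds of millions of edges, a server would presumably keep the adjacency data in an indexed on-disk store rather than an in-memory dictionary; the sketch only shows the query logic.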

    Paving the Way Towards a Successful and Fulfilling Career in Computational Biology

    Most of us will spend a significant amount of time and effort throughout our lives in improving our career. The decisions we make shape how our career progresses, and the right decisions can ensure it is successful and fulfilling. Early decisions can have a strong influence, especially in today’s competitive job market, where a university degree will not guarantee the best job. It is vital these early decisions are well informed and based on access to as much information as possible. As part of an effort to ensure that computational biologists and students are guided into the right career paths, the Regional Student Group (RSG) program, an arm of the International Society for Computational Biology (ISCB), has provided a range of activities to assist computational biologists and bioinformatics researchers in their career development. These include organizing practical workshops and seminars presented by leading experts on how to broaden the scope of career options and guarantee success. This article provides insight on some of these activities and highlights the benefits gained through the shared experiences of RSGs in running career-related activities.

    Visualization methods for statistical analysis of microarray clusters

    BACKGROUND: The most common method of identifying groups of functionally related genes in microarray data is to apply a clustering algorithm. However, it is impossible to determine which clustering algorithm is most appropriate to apply, and it is difficult to verify the results of any algorithm due to the lack of a gold standard. Appropriate data visualization tools can aid this analysis process, but existing visualization methods do not specifically address this issue. RESULTS: We present several visualization techniques that incorporate meaningful, noise-robust statistics for analyzing the results of clustering algorithms on microarray data. These include a rank-based visualization method that is more robust to noise, a difference display method to aid assessments of cluster quality and detection of outliers, and a projection of high-dimensional data into a three-dimensional space in order to examine relationships between clusters. Our methods are interactive and are dynamically linked together for comprehensive analysis. Further, our approach applies to both protein and gene expression microarrays, and our architecture is scalable for use on both desktop/laptop screens and large-scale display devices. This methodology is implemented in GeneVAnD (Genomic Visual ANalysis of Datasets) and is available at . CONCLUSION: Incorporating relevant statistical information into data visualizations is key for analysis of large biological datasets, particularly because of high levels of noise and the lack of a gold standard for comparisons. We developed several new visualization techniques and demonstrated their effectiveness for evaluating cluster quality and relationships between clusters.

    Integrated functional networks of process, tissue, and developmental stage specific interactions in Arabidopsis thaliana

    BACKGROUND: Recent years have seen an explosion in plant genomics, as the difficulties inherent in sequencing and functionally analyzing these biologically and economically significant organisms have been overcome. Arabidopsis thaliana, a versatile model organism, represents an opportunity to evaluate the predictive power of biological network inference for plant functional genomics. RESULTS: Here, we provide a compendium of functional relationship networks for Arabidopsis thaliana leveraging data integration based on over 60 microarray, physical and genetic interaction, and literature curation datasets. These include tissue, biological process, and development stage specific networks, each predicting relationships specific to an individual biological context. These biological networks enable the rapid investigation of uncharacterized genes in specific tissues and developmental stages of interest and summarize a very large collection of A. thaliana data for biological examination. We found validation in the literature for many of our predicted networks, including those involved in disease resistance, root hair patterning, and auxin homeostasis. CONCLUSIONS: These context-specific networks demonstrate that highly specific biological hypotheses can be generated for a diversity of individual processes, developmental stages, and plant tissues in A. thaliana. All predicted functional networks are available online at http://function.princeton.edu/arathGraphle.

    GOLEM: an interactive graph-based gene-ontology navigation and analysis tool

    BACKGROUND: The Gene Ontology has become an extremely useful tool for the analysis of genomic data and structuring of biological knowledge. Several excellent software tools for navigating the Gene Ontology have been developed. However, no existing system provides an interactively expandable graph-based view of the Gene Ontology hierarchy. Furthermore, most existing tools are web-based or require an Internet connection, will not load local annotation files, and provide either analysis or visualization functionality, but not both. RESULTS: To address the above limitations, we have developed GOLEM (Gene Ontology Local Exploration Map), a visualization and analysis tool for focused exploration of the Gene Ontology graph. GOLEM allows the user to dynamically expand and focus the local graph structure of the Gene Ontology hierarchy in the neighborhood of any chosen term. It also supports rapid analysis of an input list of genes to find enriched Gene Ontology terms. The GOLEM application permits the user either to utilize local Gene Ontology and annotation files in the absence of an Internet connection, or to access the most recent ontology and annotation information from the Gene Ontology webpage. GOLEM supports global and organism-specific searches by Gene Ontology term name, Gene Ontology ID and gene name. CONCLUSION: GOLEM is a useful software tool for biologists interested in visualizing the local directed acyclic graph structure of the Gene Ontology hierarchy and searching for Gene Ontology terms enriched in genes of interest. It is freely available both as an application and as an applet at
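The enrichment analysis mentioned in the GOLEM abstract can be illustrated with a minimal sketch. The abstract does not state which statistical test GOLEM uses, so the one-sided hypergeometric test below is an assumption (it is a common choice for term enrichment), and the function names, term labels, and gene names are hypothetical. No multiple-testing correction is applied.

```python
from math import comb


def hypergeom_pvalue(population, annotated, sample, hits):
    """P(X >= hits) when `sample` genes are drawn without replacement from
    `population` genes, of which `annotated` carry the term."""
    upper = min(annotated, sample)
    total = comb(population, sample)
    tail = sum(
        comb(annotated, i) * comb(population - annotated, sample - i)
        for i in range(hits, upper + 1)
    )
    return tail / total


def enriched_terms(gene_list, annotations, background, alpha=0.05):
    """Return (term, hits, p-value) tuples with p <= alpha, sorted by p-value."""
    background = set(background)
    genes = set(gene_list) & background
    results = []
    for term, members in annotations.items():
        members = set(members) & background
        hits = len(genes & members)
        if hits == 0:
            continue
        p = hypergeom_pvalue(len(background), len(members), len(genes), hits)
        if p <= alpha:
            results.append((term, hits, p))
    return sorted(results, key=lambda r: r[2])


# Hypothetical toy data: 20 background genes and two made-up terms.
background = [f"g{i}" for i in range(20)]
annotations = {
    "term_X": {"g0", "g1", "g2", "g3"},
    "term_Y": {f"g{i}" for i in range(4, 14)},
}
result = enriched_terms(["g0", "g1", "g2", "g5"], annotations, background)
print(result)
```

In the toy data, three of the four input genes carry term_X, which only four of the twenty background genes carry, so term_X passes the threshold; term_Y, annotated to half the background, does not.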

    Finding function: evaluation methods for functional genomic data

    BACKGROUND: Accurate evaluation of the quality of genomic or proteomic data and computational methods is vital to our ability to use them for formulating novel biological hypotheses and directing further experiments. There is currently no standard approach to evaluation in functional genomics. Our analysis of existing approaches shows that they are inconsistent and contain substantial functional biases that render the resulting evaluations misleading both quantitatively and qualitatively. These problems make it essentially impossible to compare computational methods or large-scale experimental datasets and also result in conclusions that generalize poorly in most biological applications. RESULTS: We reveal issues with current evaluation methods here and suggest new approaches to evaluation that facilitate accurate and representative characterization of genomic methods and data. Specifically, we describe a functional genomics gold standard based on curation by expert biologists and demonstrate its use as an effective means of evaluation of genomic approaches. Our evaluation framework and gold standard are freely available to the community through our website. CONCLUSION: Proper methods for evaluating genomic data and computational approaches will determine how much we, as a community, are able to learn from the wealth of available data. We propose one possible solution to this problem here but emphasize that this topic warrants broader community discussion.

    Discovery of biological networks from diverse functional genomic data

    We have developed a general probabilistic system for query-based discovery of pathway-specific networks through integration of diverse genome-wide data. This framework was validated by accurately recovering known networks for 31 biological processes in Saccharomyces cerevisiae and experimentally verifying predictions for the process of chromosomal segregation. Our system, bioPIXIE, a public, comprehensive system for integration, analysis, and visualization of biological network predictions for S. cerevisiae, is freely accessible over the World Wide Web.